{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "homeless-lindsay",
   "metadata": {},
   "source": [
    "# 岭回归"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "disabled-disco",
   "metadata": {},
   "outputs": [],
   "source": [
    "##keep\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "portable-country",
   "metadata": {},
   "source": [
    "## 生成数据集"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "weekly-colonial",
   "metadata": {},
   "outputs": [],
   "source": [
    "##keep\n",
    "from sklearn.datasets import make_regression\n",
    "data, target = make_regression(n_samples = 2000, n_features = 3, effective_rank = 2, noise = 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "revised-corps",
   "metadata": {},
   "source": [
    "## 自定义函数实现bootstrap\n",
    "从data中反复抽取样本，保存模型的系数并返回。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "random-sydney",
   "metadata": {},
   "outputs": [],
   "source": [
    "##keep\n",
    "def bootstrap(model):\n",
    "    global data, target\n",
    "    n = 1000\n",
    "    sample_size = int(0.5 * len(target))\n",
    "    subsample = lambda: np.random.choice(np.arange(0, len(target)), size = sample_size)\n",
    "    coefs_ = np.ones((n, 3))\n",
    "    for i in range(n):\n",
    "        idxs = subsample()\n",
    "        X, y = data[idxs], target[idxs]\n",
    "        model.fit(X, y)\n",
    "        coefs_[i][0] = model.coef_[0]\n",
    "        coefs_[i][1] = model.coef_[1]\n",
    "        coefs_[i][2] = model.coef_[2]\n",
    "    return coefs_"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dated-lawsuit",
   "metadata": {},
   "source": [
    "## 自定义函数画出模型系数直方图\n",
    "参数为模型的系数，画出每个系数的直方图，要求共享x轴。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "moving-success",
   "metadata": {},
   "outputs": [],
   "source": [
    "##keep\n",
    "def plot_coefs(coefs):\n",
    "    plt.figure(figsize = (14, 9))\n",
    "    ax1 = plt.subplot(311, title = 'coef_[0]')\n",
    "    ax1.hist(coefs[:, 0], bins = 50)\n",
    "    ax2 = plt.subplot(312, sharex = ax1, title = 'coef_[1]')\n",
    "    ax2.hist(coefs[:, 1], bins = 50)\n",
    "    ax3 = plt.subplot(313, sharex = ax1, title = 'coef_[2]')\n",
    "    ax3.hist(coefs[:, 2], bins = 50)\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "advanced-friend",
   "metadata": {},
   "source": [
    "## 线性回归系数分布\n",
    "使用不带参数的LinearRegression来适配数据集，并画出三个系数的直方图。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "processed-server",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "signed-elements",
   "metadata": {},
   "source": [
    "## 岭回归系数的分布\n",
    "用岭回归来适配数据集，并画出三个系数的直方图。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ancient-worst",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "prime-enzyme",
   "metadata": {},
   "source": [
    "## 比较两者的方差"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "medical-romania",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "compound-sucking",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "simplified-matter",
   "metadata": {},
   "source": [
    "## 用RidgeCV找到最佳的alpha参数\n",
    "使用如下参数列表：\n",
    "```\n",
    "alphas = np.linspace(0.01, 1)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "atomic-convergence",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "7ee2831f",
   "metadata": {},
   "source": [
    "查看alpha_属性值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "crazy-investigation",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "166b631e",
   "metadata": {},
   "source": [
    "查看cv_values_的形状："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "center-linux",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "instant-lounge",
   "metadata": {},
   "source": [
    "## 画出交叉验证过程中cv_values_的均值曲线\n",
    "fit()调用后，cv_values_的值默认情况下为均方误差。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "outdoor-environment",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "catholic-conclusion",
   "metadata": {},
   "source": [
    "## 用最佳参数重建模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "exempt-fellow",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "44f847d5",
   "metadata": {},
   "source": [
    "计算方差："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "amber-visitor",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "complicated-wallace",
   "metadata": {},
   "source": [
    "## 比较两个误差\n",
    "首先计算普通Ridge对象的均方误差："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "simplified-smith",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "6396747f",
   "metadata": {},
   "source": [
    "计算使用最佳参数的Ridge对象的均方误差："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "prostate-applicant",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
